
A new variance-based approach for discriminative feature extraction in machine hearing classification using spectrogram features


Abstract

Machine hearing is an emerging research field that is analogous to machine vision in that it aims to equip computers with the ability to hear and recognise a variety of sounds. It is a key enabler of natural human-computer speech interfacing, as well as of areas such as automated security surveillance, environmental monitoring, and smart homes/buildings/cities.

Recent advances in machine learning allow current systems to accurately recognise a diverse range of sounds under controlled conditions. However, doing so in real-world noisy conditions remains a challenging task. Several front-end feature extraction methods have been used for machine hearing, employing speech recognition features such as MFCC and PLP, as well as image-like features such as AIM and SIF. The best choice of feature is found to depend on the noise environment and the machine learning techniques used. Machine learning methods such as deep neural networks have been shown to be capable of inferring discriminative classification rules from less structured front-end features in related domains. In the machine hearing field, spectrogram image features have recently shown good performance on noise-corrupted classification tasks using deep neural networks. However, there are many ways of extracting features from spectrograms. This paper explores a novel data-driven feature extraction method that uses variance-based criteria to define the spectral pooling of features from spectrograms. The proposed method, based on maximising the pooled spectral variance of foreground and background sound models, is shown to achieve very good performance for robust classification.
